18 research outputs found

    ClassyFire: automated chemical classification with a comprehensive, computable taxonomy

    Additional file 5. Use cases. Text-based search on the ClassyFire web server. (A) Building the query. (B) Sparteine, one of the returned compounds

    Cheminformatics and artificial intelligence for accelerating agrochemical discovery

    The global cost-benefit analysis of pesticide use during the last 30 years has been characterized by a significant increase during the period from 1990 to 2007 followed by a decline. This observation can be attributed to several factors including, but not limited to, pest resistance, lack of novelty with respect to modes of action or classes of chemistry, and regulatory action. Due to current and projected increases of the global population, it is evident that the demand for food, and consequently the usage of pesticides to improve yields, will increase. Addressing these challenges and needs while promoting new crop protection agents through an increasingly stringent regulatory landscape requires the development and integration of infrastructures for innovative, cost- and time-effective discovery and development of novel and sustainable molecules. Significant advances in artificial intelligence (AI) and cheminformatics over the last two decades have improved the decision-making power of research scientists in the discovery of bioactive molecules. AI- and cheminformatics-driven molecule discovery offers the opportunity of moving experiments from the greenhouse to a virtual environment where thousands to billions of molecules can be investigated at a rapid pace, providing unbiased hypotheses for lead generation and optimization, and effective suggestions for compound synthesis and testing. To date, this is illustrated to a far lesser extent in the publicly available agrochemical research literature compared to drug discovery. In this review, we provide an overview of the crop protection discovery pipeline and how traditional, cheminformatics, and AI technologies can help to address the needs and challenges of agrochemical discovery towards rapidly developing novel and more sustainable products

    Computational applications in secondary metabolite discovery (CAiSMD): An online workshop

    We report the major conclusions of the online open-access workshop “Computational Applications in Secondary Metabolite Discovery (CAiSMD)” that took place from 08 to 10 March 2021. Invited speakers from academia and industry and about 200 registered participants from five continents (Africa, Asia, Europe, South America, and North America) took part in the workshop. The workshop highlighted the potential applications of computational methodologies in the search for secondary metabolites (SMs) or natural products (NPs) as potential drugs and drug leads. Over 3 days, the participants of this online workshop received an overview of modern computer-based approaches for exploring NP discovery in the “omics” age. The invited experts gave keynote lectures, trained participants in hands-on sessions, and held round table discussions. This was followed by oral presentations with much interaction between the speakers and the audience. Selected applicants (early-career scientists) were offered the opportunity to give oral presentations (15 min) and present posters in the form of flash presentations (5 min) upon submission of an abstract. The final program available on the workshop website (https://caismd.indiayouth.info/) comprised 4 keynote lectures (KLs), 12 oral presentations (OPs), 2 round table discussions (RTDs), and 5 hands-on sessions (HSs). This meeting report also references internet resources for computational biology in the area of secondary metabolites that are of use outside of the workshop areas and will constitute a long-term valuable source for the community. The workshop concluded with an online survey form to be completed by speakers and participants with the goal of improving any subsequent editions

    The NORMAN Suspect List Exchange (NORMAN-SLE): facilitating European and worldwide collaboration on suspect screening in high resolution mass spectrometry

    Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of > 40,000 unique views, > 50,000 unique downloads and 40 citations (May 2022).
NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA's CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101). Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the "one substance, one assessment" approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/)

    The NORMAN Suspect List Exchange (NORMAN-SLE): Facilitating European and worldwide collaboration on suspect screening in high resolution mass spectrometry

    Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for “suspect screening” lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of > 40,000 unique views, > 50,000 unique downloads and 40 citations (May 2022). 
NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101). Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the “one substance, one assessment” approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/)
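
As an illustration of what "suspect screening" against such a list can look like in practice, the sketch below matches one measured m/z value against a tiny suspect list within a ppm tolerance. It is a minimal sketch under stated assumptions, not the NORMAN-SLE workflow: the function names, the 5 ppm tolerance, the restriction to [M+H]+ ions, and the three example compounds (with approximate, rounded monoisotopic masses) are all chosen here for illustration.

```python
# Hypothetical helper names; only the [M+H]+ adduct is considered (an assumption).
PROTON_MASS = 1.007276  # Da, mass shift from neutral molecule to [M+H]+


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def screen(measured_mz, suspects, tol_ppm=5.0):
    """Return (name, ppm error) for suspects whose [M+H]+ m/z lies within tol_ppm."""
    hits = []
    for name, mono_mass in suspects.items():
        expected_mz = mono_mass + PROTON_MASS
        err = ppm_error(measured_mz, expected_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    return hits


# Approximate monoisotopic masses (Da), rounded; illustrative entries only.
suspects = {
    "caffeine": 194.0804,
    "atrazine": 215.0938,
    "carbamazepine": 236.0950,
}

hits = screen(237.1022, suspects)
for name, err in hits:
    print(f"{name}: {err:+.2f} ppm")
```

A real workflow would also consider other adducts, isotope patterns, retention time and fragmentation evidence before reporting a tentative identification.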

    The NORMAN Suspect List Exchange (NORMAN-SLE): facilitating European and worldwide collaboration on suspect screening in high resolution mass spectrometry

    The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. The NORMAN-SLE project has received funding from the NORMAN Association via its joint proposal of activities. HMT and ELS are supported by the Luxembourg National Research Fund (FNR) for project A18/BM/12341006. ELS, PC, SEH, HPHA, ZW acknowledge funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101036756, project ZeroPM: Zero pollution of persistent, mobile substances. The work of EEB, TC, QL, BAS, PAT, and JZ was supported by the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health (NIH). JOB is the recipient of an NHMRC Emerging Leadership Fellowship (EL1 2009209). KVT and JOB acknowledge the support of the Australian Research Council (DP190102476). The Queensland Alliance for Environmental Health Sciences, The University of Queensland, gratefully acknowledges the financial support of the Queensland Department of Health. NR is supported by a Miguel Servet contract (CP19/00060) from the Instituto de Salud Carlos III, co-financed by the European Union through Fondo Europeo de Desarrollo Regional (FEDER).
MM and TR gratefully acknowledge financial support by the German Ministry for Education and Research (BMBF, Bonn) through the project “Persistente mobile organische Chemikalien in der aquatischen Umwelt (PROTECT)” (FKz: 02WRS1495 A/B/E). LiB acknowledges funding through a Research Foundation Flanders (FWO) fellowship (11G1821N). JAP and JMcL acknowledge financial support from the NIH for CCSCompendium (S50 CCSCOMPEND) via grants NIH NIGMS R01GM092218 and NIH NCI 1R03CA222452-01, as well as the Vanderbilt Chemical Biology Interface training program (5T32GM065086-16), plus use of resources of the Center for Innovative Technology (CIT) at Vanderbilt University. TJ was (partly) supported by the Dutch Research Council (NWO), project number 15747. UFZ (TS, MaK, WB) received funding from SOLUTIONS project (European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant Agreement No. 603437). TS, MaK, WB, JPA, RCHV, JJV, JeM and MHL acknowledge HBM4EU (European Union’s Horizon 2020 research and innovation programme under the grant agreement no. 733032). TS acknowledges funding from NFDI4Chem—Chemistry Consortium in the NFDI (supported by the DFG under project number 441958208). TS, MaK, WB and EMLJ acknowledge NaToxAq (European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Grant Agreement No. 722493). S36 and S63 (HPHA, SEH, MN, IS) were funded by the German Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU) Project No. (FKZ) 3716 67 416 0, updates to S36 (HPHA, SEH, MN, IS) by the German Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV) Project No. (FKZ) 3719 65 408 0. MiK acknowledges financial support from the EU Cohesion Funds within the project Monitoring and assessment of water body status (No. 310011A366 Phase III). 
The work related to S60 and S82 was funded by the Swiss Federal Office for the Environment (FOEN); KK and JH acknowledge the input of Kathrin Fenner’s group (Eawag) in compiling transformation products from European pesticides registration dossiers. DSW and YDF were supported by the Canadian Institutes of Health Research and Genome Canada. The work related to S49, S48 and S77 was funded by the MAVA foundation; for S77 also the Valery Foundation (KG, JaM, BG). DML acknowledges National Science Foundation Grant RUI-1306074. YL acknowledges the National Natural Science Foundation of China (Grant No. 22193051 and 21906177), and the Chinese Postdoctoral Science Foundation (Grant No. 2019M650863). WLC acknowledges research project 108C002871 supported by the Environmental Protection Administration, Executive Yuan, R.O.C. Taiwan (Taiwan EPA). JG acknowledges funding from the Swiss Federal Office for the Environment. AJW was funded by the U.S. Environmental Protection Agency. LuB, AC and FH acknowledge the financial support of the Generalitat Valenciana (Research Group of Excellence, Prometeo 2019/040). KN (S89) acknowledges the PhD fellowship through Marie Skłodowska-Curie grant agreement No. 859891 (MSCA-ETN). Exposome-Explorer (S34) was funded by the European Commission projects EXPOsOMICS FP7-KBBE-2012 [308610]; NutriTech FP7-KBBE-2011-5 [289511]; Joint Programming Initiative FOODBALL 2014–17. CP acknowledges grant RYC2020-028901-I funded by MCIN/AEI/10.13039/501100011033 and “ESF investing in your future”, and August T Larsson Guest Researcher Programme from the Swedish University of Agricultural Sciences. The work of ML, MaSe, SG, TL and WS creating and filling the STOFF-IDENT database (S2) was mostly sponsored by the German Federal Ministry of Education and Research within the RiSKWa program (funding codes 02WRS1273 and 02WRS1354). XT acknowledges The National Food Institute, Technical University of Denmark.
MaSch acknowledges funding by the RECETOX research infrastructure (the Czech Ministry of Education, Youth and Sports, LM2018121), the CETOCOEN PLUS project (CZ.02.1.01/0.0/0.0/15_003/0000469), and the CETOCOEN EXCELLENCE Teaming 2 project supported by the Czech Ministry of Education, Youth and Sports (No CZ.02.1.01/0.0/0.0/17_043/0009632). Peer reviewed

    CypReact: A Software Tool for in Silico Reactant Prediction for Human Cytochrome P450 Enzymes

    In silico metabolism prediction requires first predicting whether a specific molecule will interact with one or more specific metabolizing enzymes, then predicting the result of each enzymatic reaction. Here, we provide a computational tool, CypReact, for performing this first task of reactant prediction. Specifically, CypReact takes as input an arbitrary molecule (specified as a SMILES string or a standard SDF file) and any one of nine of the most important human cytochrome P450 (CYP450) enzymes (CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, or CYP3A4) and accurately predicts whether the query molecule will react with that given CYP450 enzyme. Tests of CypReact, conducted over a data set of 1632 molecules (each considered a “plausible” reactant), show that it is very effective, with a (cross-validation) AUROC (area under the receiver operating characteristic curve) of 0.83–0.92. We also show that CypReact performs significantly better than other reactant prediction tools such as ADMET Predictor and (a reactant-predicting extension of) SMARTCyp, whose average AUROCs are 0.75 and 0.53, respectively. We then applied the learned CypReact models to a previously unseen set of molecules and found that CypReact did even better and still significantly surpassed the performance of SMARTCyp and ADMET Predictor. These results suggest that CypReact could be an important component of a suite of in silico metabolism prediction tools for accurately predicting the products of Phase I, Phase II, and microbial metabolism in humans. CypReact is available at https://bitbucket.org/Leon_Ti/cypreact
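
The AUROC figures quoted above can be read as the probability that a randomly chosen reactant receives a higher score than a randomly chosen non-reactant. The sketch below computes that quantity directly from pairwise comparisons. It is only an illustration of the metric, not CypReact's evaluation code; the function name and the toy labels and scores are assumptions made here.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (reactant) or 0 (non-reactant).
    scores: predicted scores, higher meaning "more likely a reactant".
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (made up for illustration): three reactants, three non-reactants.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
value = auroc(labels, scores)
print(f"AUROC = {value:.3f}")
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why an average AUROC of 0.53 indicates performance close to chance.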

    CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification

    Metabolite identification for untargeted metabolomics is often hampered by the lack of experimentally collected reference spectra from tandem mass spectrometry (MS/MS). To circumvent this problem, Competitive Fragmentation Modeling-ID (CFM-ID) was developed to accurately predict electrospray ionization-MS/MS (ESI-MS/MS) spectra from chemical structures and to aid in compound identification via MS/MS spectral matching. While earlier versions of CFM-ID performed very well, CFM-ID’s performance for predicting the MS/MS spectra of certain classes of compounds, including many lipids, was quite poor. Furthermore, CFM-ID’s compound identification capabilities were limited because it did not use experimentally available MS/MS spectra nor did it exploit metadata in its spectral matching algorithm. Here, we describe significant improvements to CFM-ID’s performance and speed. These include (1) the implementation of a rule-based fragmentation approach for lipid MS/MS spectral prediction, which greatly improves the speed and accuracy of CFM-ID; (2) the inclusion of experimental MS/MS spectra and other metadata to enhance CFM-ID’s compound identification abilities; (3) the development of new scoring functions that improve CFM-ID’s accuracy by 21.1%; and (4) the implementation of a chemical classification algorithm that correctly classifies unknown chemicals (based on their MS/MS spectra) in >80% of the cases. This improved version, called CFM-ID 3.0, is freely available as a web server. Its source code is also accessible online
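
To make the idea of "MS/MS spectral matching" concrete, the sketch below scores a query spectrum against a candidate spectrum with a plain binned cosine similarity. This is a generic, commonly used similarity measure swapped in for illustration; the abstract does not specify CFM-ID's scoring functions, and the function names, the 0.01 m/z bin width and the toy peak lists are assumptions made here.

```python
import math
from collections import defaultdict


def binned(spectrum, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins.

    spectrum: list of (m/z, intensity) pairs.
    """
    vec = defaultdict(float)
    for mz, intensity in spectrum:
        vec[round(mz / bin_width)] += intensity
    return vec


def cosine_score(spec_a, spec_b, bin_width=0.01):
    """Cosine similarity between two binned spectra (0 = no overlap, 1 = same shape)."""
    a = binned(spec_a, bin_width)
    b = binned(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy peak lists (made up): the candidate shares one of the query's two peaks.
query = [(100.0, 1.0), (150.0, 1.0)]
candidate = [(100.0, 1.0)]
score = cosine_score(query, candidate)
print(f"cosine score = {score:.3f}")
```

Ranking predicted or reference spectra by such a score, highest first, gives a candidate list for an unknown compound; production tools typically add intensity weighting and metadata on top of this basic comparison.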